Comments on "Incremental Construction and Maintenance of Minimal Finite-State Automata," by Rafael C. Carrasco and Mikel L. Forcada

Author

  • Jan Daciuk
Abstract

In a recent article, Carrasco and Forcada (June 2002) presented two algorithms: one for incremental addition of strings to the language of a minimal, deterministic, cyclic automaton, and one for incremental removal of strings from the automaton. The first algorithm is a generalization of the “algorithm for unsorted data”—the second of the two incremental algorithms for construction of minimal, deterministic, acyclic automata presented in Daciuk et al. (2000). We show that the other algorithm in the older article—the “algorithm for sorted data”—can be generalized in a similar way. The new algorithm is faster than the algorithm for addition of strings presented in Carrasco and Forcada’s article, as it handles each state only once.
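For orientation, the "algorithm for sorted data" mentioned above can be sketched roughly as follows. This is a minimal illustration of the acyclic version as it is commonly described for Daciuk et al. (2000), not the cyclic generalization proposed in this comment. All names (`State`, `build_minimal_dafsa`, `accepts`, `count_states`) are hypothetical, and the sketch assumes the input is sorted in Python's default string order.

```python
class State:
    """A state of a deterministic acyclic automaton (hypothetical helper)."""

    def __init__(self):
        self.final = False
        self.edges = {}  # label -> State

    def signature(self):
        # Two states are equivalent if they agree on finality and have the
        # same labelled transitions to the same (already registered) targets.
        return (self.final,
                tuple((c, id(t)) for c, t in sorted(self.edges.items())))


def build_minimal_dafsa(sorted_words):
    """Build a minimal automaton from words given in sorted order."""
    register = {}  # signature -> unique representative state
    root = State()

    def replace_or_register(state):
        # Input is sorted, so the most recently added child has the largest label.
        label = max(state.edges)
        child = state.edges[label]
        if child.edges:
            replace_or_register(child)
        sig = child.signature()
        if sig in register:
            state.edges[label] = register[sig]  # reuse an equivalent state
        else:
            register[sig] = child

    for word in sorted_words:
        # Follow the longest prefix already present in the automaton.
        state, i = root, 0
        while i < len(word) and word[i] in state.edges:
            state = state.edges[word[i]]
            i += 1
        # The previous word's suffix below this state can no longer change.
        if state.edges:
            replace_or_register(state)
        # Append the remaining suffix as a chain of fresh states.
        for ch in word[i:]:
            nxt = State()
            state.edges[ch] = nxt
            state = nxt
        state.final = True

    if root.edges:
        replace_or_register(root)
    return root


def accepts(root, word):
    """Return True if the automaton rooted at `root` accepts `word`."""
    state = root
    for ch in word:
        state = state.edges.get(ch)
        if state is None:
            return False
    return state.final


def count_states(root):
    """Count the distinct states reachable from `root`."""
    seen, stack = set(), [root]
    while stack:
        s = stack.pop()
        if id(s) not in seen:
            seen.add(id(s))
            stack.extend(s.edges.values())
    return len(seen)
```

In this sketch each state is compared against the register once, after the sorted input guarantees that it will receive no further outgoing transitions. That mirrors the abstract's remark that handling each state only once is where the speed advantage comes from.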


Similar resources

Incremental Construction and Maintenance of Minimal Finite-State Automata

Daciuk et al. [Computational Linguistics 26(1):3–16 (2000)] describe a method for constructing incrementally minimal, deterministic, acyclic finite-state automata (dictionaries) from sets of strings. But acyclic finite-state automata have limitations: For instance, if one wants a linguistic application to accept all possible integer numbers or Internet addresses, the corresponding finite-state a...


Incremental construction and maintenance of morphological analysers based on augmented letter transducers

We define deterministic augmented letter transducers (DALTs), a class of finite-state transducers which provide an efficient way of implementing morphological analysers which tokenize their input (i.e., divide texts in tokens or words) as they analyse it, and show how these morphological analysers may be maintained (i.e., how surface form–lexical form transductions may be added or removed from t...


An Implementation of Deterministic Tree Automata Minimization

A frontier-to-root deterministic finite-state tree automaton (DTA) can be used as a compact data structure to store collections of unranked ordered trees. DTAs are usually sparser than string automata, as most transitions are undefined and therefore, special care must be taken in order to minimize them efficiently. However, it is difficult to find simple and detailed descriptions of the minimiz...


Unsupervised Training of a Finite-State Sliding-Window Part-of-Speech Tagger

A simple, robust sliding-window part-of-speech tagger is presented and a method is given to estimate its parameters from an untagged corpus. Its performance is compared to a standard Baum-Welch-trained hidden-Markov-model part-of-speech tagger. Transformation into a finite-state machine, behaving exactly as the tagger itself, is demonstrated.


Efficient encodings of finite automata in discrete-time recurrent neural networks

A number of researchers have used discrete-time recurrent neural nets (DTRNN) to learn finite-state machines (FSM) from samples of input and output strings; trained DTRNN usually show FSM behaviour for strings up to a certain length, but not beyond; this is usually called instability. Other authors have shown that DTRNN may actually behave as FSM for strings of any length and have devised strate...



Journal:
  • Computational Linguistics

Volume 30, Issue

Pages -

Publication date: 2004